Listing of Claims:

1. (Currently amended) A method in a data processing system for generating and storing 
in a database an entry, the method comprising the steps of: 
generating an entry comprising: 

i) data identifying a molecule; 

ii) data identifying at least one region in the molecule; and 

iii) a set of axes derived from property distribution information of the at least one region, 
the set of axes characterizing the at least one region; 

generating at least one descriptor vector for the at least one region; 

applying a mapping to the at least one descriptor vector associated with the at least one 
region to construct a key based on preselected criteria; and 

storing the entry in a memory, wherein the key is associated with the entry such that the 
key indexes the entry for retrieval thereof.

[[In a data processing system wherein descriptor vectors associated with a plurality of 
regions of molecules are stored in a database, a method for generating and storing data 
characterizing at least one region of said plurality of regions, the method comprising the steps 
of: 

generating an entry comprising i) an identifier that identifies said at least one region, and 
ii) data characterizing a set of axes derived from a property distribution of said at least one 
region; 

applying a mapping to the descriptor vector associated with said at least one region based 
on preselected criteria; 

generating a key that corresponds to said mapping of the descriptor vector associated 
with said at least one region; and 

storing said entry in a memory, wherein said key is associated with said entry such that 
the key indexes the entry for retrieval thereof.]]
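
The steps of claim 1 can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration only: the `Entry` class, the `make_key` and `store_entry` helpers, the linear `projection`, and the binning rule that turns the mapped descriptor vector into a key are hypothetical and are not taken from the claims, which leave the mapping and the "preselected criteria" open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    molecule_id: str  # i) data identifying a molecule
    region_id: str    # ii) data identifying a region in the molecule
    axes: tuple       # iii) set of axes characterizing the region


def make_key(descriptor, projection, bin_width=0.5):
    """Map a descriptor vector to a hashable key (hypothetical mapping).

    Each row of `projection` is dotted with the descriptor, and each
    resulting coordinate is quantized into a bin of width `bin_width`.
    """
    coords = [sum(p * d for p, d in zip(row, descriptor)) for row in projection]
    return tuple(int(c // bin_width) for c in coords)


def store_entry(database, entry, descriptor, projection):
    """Store the entry so that the key indexes it for later retrieval."""
    key = make_key(descriptor, projection)
    database.setdefault(key, []).append(entry)
    return key


database = {}
projection = [(1.0, 0.0, 0.0), (0.0, 1.0, 1.0)]  # assumed mapping
entry = Entry("mol-001", "region-A", ((1, 0, 0), (0, 1, 0), (0, 0, 1)))
key = store_entry(database, entry, (0.9, 0.2, 0.4), projection)
```

A later query with a similar descriptor vector maps to the same key and therefore retrieves the stored entry with a single dictionary lookup.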

2. (Cancelled) 
3. (Cancelled) 

4. (Currently amended) The method of claim 1, wherein [[said]] the property 
distribution information of [[said]] the at least one region is computed from a convolution with a 
probe function to a property field. 
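
As a rough illustration of the convolution recited in claim 4, the sketch below convolves a one-dimensional sampled property field with a normalized Gaussian probe. The one-dimensional grid, the Gaussian shape, and the names `gaussian_probe` and `convolve` are assumptions chosen for brevity; the claim does not specify the probe function or the dimensionality of the field.

```python
import math


def gaussian_probe(half_width, sigma):
    """Return a normalized, symmetric Gaussian probe of 2*half_width + 1 samples."""
    weights = [math.exp(-0.5 * (k / sigma) ** 2)
               for k in range(-half_width, half_width + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def convolve(field, probe):
    """Same-size discrete convolution; samples outside the field count as zero."""
    half = len(probe) // 2
    out = []
    for i in range(len(field)):
        acc = 0.0
        for k, w in enumerate(probe):
            j = i + half - k  # field index paired with probe offset (k - half)
            if 0 <= j < len(field):
                acc += w * field[j]
        out.append(acc)
    return out


# Hypothetical property field sampled along one axis (e.g. a charge-like value).
field = [0.0, 0.0, 1.0, 0.0, 0.0, -1.0, 0.0]
probe = gaussian_probe(half_width=1, sigma=1.0)
distribution = convolve(field, probe)
```

The resulting `distribution` is a smoothed version of the field, from which region-level quantities such as moments could then be computed.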

5. (Currently amended) The method of claim 1, wherein [[said]] the at least one 
[[plurality of]] descriptor vector[[s are]] is classified into groups, and wherein [[said]] the mapping step 
maps [[said]] the at least one descriptor vector[[s]] to a space discriminating between said groups of 
descriptor vectors. 

6. (Currently amended) The method of claim 5, wherein [[said]] the mapping is derived 
from the steps of: 

generating first data representing differences between said groups of descriptor vectors; 
generating second data representing variations within said groups of descriptor vectors; 

identifying a set of component vectors that maximizes a ratio of variations between 
groups to the variations within groups along the component vectors as a discriminant criterion 
function [[an F distributed criterion function, said criterion function having a numerator based 
upon said first data and a denominator based upon said second data]]; 

generating a criterion function for subsets of the component vectors, wherein the criterion 
function utilizes the first data and the second data [[an F distributed statistic for subsets of said 
component vectors, said statistic having a numerator based upon said first data and a 
denominator based upon said second data]]; 

for each particular subset of [[said]] component vectors, calculating a probability value for 
the [[F-distributed statistic]] criterion functions associated with the particular subset; 

selecting a probability value from probability values for [[said]] the subsets of said 
component vectors based upon a predetermined criterion; 

identifying the subset of [[said]] component vectors associated with the selected probability 
value; and 

generating a mapping to a space corresponding to the subset of [[said]] component vectors 
associated with the selected probability value, and storing the mapping for subsequent 
processing. 

7. (Currently amended) The method of claim 6, wherein [[said]] the first data comprises 
a matrix representing covariance between [[said]] the groups of descriptor vectors, and [[said]] the 
second data comprises a matrix representing covariance within [[said]] the groups of descriptor 
vectors. 
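
One common way to build the two matrices recited in claim 7 is sketched below. It is an assumption that the matrices are the unnormalized between-group and within-group scatter matrices used in classical discriminant analysis; the claim itself only says that each matrix "represents covariance". The function and variable names are hypothetical.

```python
def mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def add_outer(acc, d, scale=1.0):
    """Add scale * (d d^T) into the square matrix acc, in place."""
    for i, a in enumerate(d):
        for j, b in enumerate(d):
            acc[i][j] += scale * a * b


def scatter_matrices(groups):
    """Return (between-group, within-group) scatter matrices."""
    dim = len(groups[0][0])
    grand = mean([v for g in groups for v in g])
    between = [[0.0] * dim for _ in range(dim)]
    within = [[0.0] * dim for _ in range(dim)]
    for g in groups:
        mu = mean(g)
        # Between groups: spread of the group mean about the grand mean.
        add_outer(between, [a - b for a, b in zip(mu, grand)], scale=len(g))
        # Within groups: spread of each vector about its own group mean.
        for v in g:
            add_outer(within, [a - b for a, b in zip(v, mu)])
    return between, within


groups = [
    [(0.0, 0.0), (2.0, 0.0)],  # descriptor vectors in group 0
    [(4.0, 2.0), (6.0, 2.0)],  # descriptor vectors in group 1
]
sigma_b, sigma_w = scatter_matrices(groups)
```

In this toy data the groups differ along both coordinates but vary internally only along the first, so `sigma_w` is nonzero only in its first diagonal element.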

8. (Currently amended) The method of claim 7, wherein [[said]] the criterion function 
has the general form: 

$$f(\mathbf{v}) = C \, \frac{\mathbf{v}^{T} \Sigma_b \mathbf{v}}{\mathbf{v}^{T} \Sigma_w \mathbf{v}}$$

where $\mathbf{v}$ is some vector, $T$ indicates a transpose, $\Sigma_b$ is a first data representing covariance, $\Sigma_w$ 
is a second data representing covariance and $C$ is a constant based upon degrees of freedom in 
$\Sigma_b$ and $\Sigma_w$. 

9. (Currently amended) The method of claim 8, wherein the variable C is determined 
as follows: 

$$1 / \left(\sum n_i - N\right)$$
where $N$ represents the number of groups of descriptor vectors, $n_i$ represents the number of 
regions, and $\sum n_i$ represents the sum of $n_i$ for the $N$ groups. 
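
The criterion function of claims 8 and 9 is a scaled ratio of two quadratic forms, which can be sketched as follows. The constant used here, (sum of group sizes minus N) divided by (N minus 1), is the standard one-way ANOVA degrees-of-freedom ratio and is an assumption of this sketch rather than the expression recited in claim 9. The names `quad_form` and `criterion` are hypothetical.

```python
def quad_form(v, m):
    """Evaluate the quadratic form v^T m v for a square matrix m."""
    n = len(v)
    return sum(v[i] * m[i][j] * v[j] for i in range(n) for j in range(n))


def criterion(v, sigma_b, sigma_w, group_sizes):
    """Scaled ratio of between-group to within-group variation along v."""
    n_groups = len(group_sizes)
    # Assumed constant: within-group over between-group degrees of freedom.
    c = (sum(group_sizes) - n_groups) / (n_groups - 1)
    return c * quad_form(v, sigma_b) / quad_form(v, sigma_w)


sigma_b = [[16.0, 8.0], [8.0, 4.0]]  # illustrative between-group matrix
sigma_w = [[4.0, 0.0], [0.0, 1.0]]   # illustrative within-group matrix
value = criterion((1.0, 0.0), sigma_b, sigma_w, group_sizes=[2, 2])
```

Because numerator and denominator are both quadratic in the vector, the value depends only on the direction of the vector and not on its length.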

10. (Currently amended) The method of claim 7, wherein the step of identifying a set 
of component vectors that maximizes an [[F distributed]] F-distributed criterion function comprises 
the substeps of: 

determining a set of (eigenvalue, eigenvector) pairs for the matrix $\Sigma_w$; and 
determining [[said]] the set of component vectors based upon [[said]] the set of (eigenvalue, 
eigenvector) pairs for the matrix $\Sigma_w$. 

11. (Currently amended) The method of claim 10, wherein [[said statistic]] the 
F-distributed statistic for a given subset of component vectors is based upon value of [[said]] the 
criterion function for [[said]] the subset of component vectors. 

12. (Currently amended) The method of claim 11, wherein [[said statistic]] the 
F-distributed statistic for a given subset of component vectors has the following form: 

where $f_k$ represents the value of the criterion function at a component vector in the given subset, 
$C$ is a constant, $L_s$ represents the number of $f_k$ values in the given subset of component vectors, 
and the $\Sigma$ operation sums over the $L_s$ $f_k$ values in the given subset of component vectors. 

13. (Currently amended) The method of claim 12, wherein [[said a]] the probability 
value for a particular F-distributed statistic represents a probability value that the particular 
F-distributed statistic could have been larger by chance. 

14. (Currently amended) The method of claim 13, wherein [[said]] the probability value 
selected from probability values for [[said]] the subsets of component vectors is a minimum 
probability value of [[said]] the probability values for [[said]] the subsets of component vectors. 

15. (Currently amended) The method of claim 6, wherein [[said]] the mapping for [[said]] 
the at least one descriptor vector performs a loop over each component vector belonging to the 
subset of component vectors associated with the selected probability; 

wherein, in each iteration of [[said]] the loop, dot product of [[said]] the descriptor vector with a 
transpose of a unit vector for the given component vector is added to a running sum. 
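
The loop recited in claim 15 translates almost directly into code. The sketch below follows the claim wording literally: for each selected component vector, it normalizes that vector to unit length and adds its dot product with the descriptor vector to a running sum. The names `unit` and `map_descriptor` and the sample vectors are hypothetical.

```python
import math


def unit(v):
    """Return v scaled to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def map_descriptor(descriptor, component_vectors):
    """Accumulate descriptor . unit(component) over the selected components."""
    running_sum = 0.0
    for component in component_vectors:
        u = unit(component)
        running_sum += sum(d * x for d, x in zip(descriptor, u))
    return running_sum


# Hypothetical subset of component vectors associated with the selected probability.
selected = [(2.0, 0.0, 0.0), (0.0, 3.0, 4.0)]
mapped = map_descriptor((1.0, 5.0, 10.0), selected)
```

Here the first component contributes 1.0 and the second contributes 11.0, so the running sum ends at 12.0.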



16. (Cancelled) 

17. (Cancelled) 

18. (Cancelled) 

19. (Cancelled) 

20. (Cancelled) 

21. (Cancelled) 

22. (Cancelled) 

23. (Cancelled) 

24. (Cancelled) 



Docket No. Y0998-112 Serial No. 09/275,568 






25. (Cancelled) 

26. (Cancelled) 

27. (Cancelled) 

28. (Cancelled) 

29. (Cancelled) 

30. (Cancelled) 

31. (New) The method of claim 1, wherein the at least one descriptor vector is 
invariant to rotation and translation of the at least one region. 

32. (New) The method of claim 31, wherein the set of axes is derived from principal 
axes of second moments of a region of the property distribution information. 
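
Principal axes of second moments, as recited in claim 32, can be illustrated in two dimensions, where the eigenvectors of the second-moment matrix have a closed form. The sketch assumes non-negative weights standing in for the property distribution and a planar region; both are simplifications made for illustration, and the function name is hypothetical.

```python
import math


def principal_axes_2d(points, weights):
    """Return (centroid, major axis, minor axis) of a weighted 2-D point set."""
    total = sum(weights)
    cx = sum(w * x for (x, _), w in zip(points, weights)) / total
    cy = sum(w * y for (_, y), w in zip(points, weights)) / total
    # Central second moments of the weighted distribution.
    sxx = sum(w * (x - cx) ** 2 for (x, _), w in zip(points, weights)) / total
    syy = sum(w * (y - cy) ** 2 for (_, y), w in zip(points, weights)) / total
    sxy = sum(w * (x - cx) * (y - cy) for (x, y), w in zip(points, weights)) / total
    # Orientation of the axis of largest second moment.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    major = (math.cos(theta), math.sin(theta))
    minor = (-math.sin(theta), math.cos(theta))
    return (cx, cy), major, minor


points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
weights = [1.0, 1.0, 1.0, 1.0]
centroid, major, minor = principal_axes_2d(points, weights)
```

Because the axes are computed relative to the centroid and rotate with the point set, coordinates expressed in this frame do not change when the region is translated or rotated, which is the kind of invariance claim 31 refers to.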

33. (New) The method of claim 6, wherein the probability value is obtained by 
treating the ratio as an F-distributed statistic. 

34. (New) The method of claim 6, wherein the probability value is obtained by any 
one of cross-validation, jack-knife and bootstrap estimations. 
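
Of the alternatives listed in claim 34, the bootstrap is the simplest to sketch. The example below estimates the probability that a between-group to within-group variance ratio at least as large as the observed one arises by chance, by resampling with replacement from the pooled values. It assumes the descriptor vectors have already been projected to scalars, and the resampling scheme, function names, and data are all hypothetical.

```python
import math
import random


def f_ratio(groups):
    """Between-group over within-group mean square for scalar groups."""
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if within == 0.0:
        return math.inf if between > 0.0 else 0.0
    return (between / (len(groups) - 1)) / (within / (n_total - len(groups)))


def bootstrap_p_value(groups, n_resamples=2000, seed=0):
    """Estimate how often resampled pooled data matches or beats the observed ratio."""
    rng = random.Random(seed)
    observed = f_ratio(groups)
    pooled = [x for g in groups for x in g]
    hits = 0
    for _ in range(n_resamples):
        # Draw groups of the original sizes from the pooled values (null hypothesis).
        fake = [[rng.choice(pooled) for _ in g] for g in groups]
        if f_ratio(fake) >= observed:
            hits += 1
    return (hits + 1) / (n_resamples + 1)


separated = [[1.0, 1.2, 0.9, 1.1, 1.0], [3.0, 3.1, 2.9, 3.2, 3.0]]
overlapping = [[1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5]]
p_separated = bootstrap_p_value(separated)
p_overlapping = bootstrap_p_value(overlapping)
```

Well-separated groups yield a small estimated probability and overlapping groups a large one, so the estimate can play the same role as a probability value derived from an F distribution.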

35. (New) The method of claim 6, wherein application in constructing the 
discriminant criterion function includes boosting and bagging techniques. 